DOCUMENT RESUME

ED 111 171                                        FL 005 512

AUTHOR        Bordie, John G.
TITLE         Direct and Indirect Measures of Language Proficiency.
PUB DATE      1 Jan 73
NOTE          11p.; Paper presented at the meeting of the
              Linguistic Association of the Southwest (LASSO)
              (January 1, 1973); Best copy available

EDRS PRICE    MF-$0.76 Plus Postage. HC Not Available from EDRS.
DESCRIPTORS   Elementary Education; Evaluation; *Language Ability;
              *Language Proficiency; Language Skills; *Language
              Tests; *Measurement Instruments; *Measurement
              Techniques; Test Reliability; Verbal Ability; Written
              Language



ABSTRACT 

There is a lack of adequate measurement techniques
for testing language proficiency. Researchers compose specific tests
for a certain task, but these have only limited general
applicability. Often multiple-choice, true-false or fill-in-the-blank
tests are used, but these rely heavily on written language and are
inadequate for those with poor written but good verbal skills. Such
tests tend to compartmentalize learning into components and neglect
the overall view, and they generally reflect academic language rather
than current vernacular. Norm-referenced, criterion-referenced and
non-formal tests involving listing of words following a language cue
may be affected by the individual's attitude toward being tested.
Indirect tests may solve some of these problems by examining language
produced in a non-test situation. Factors studied are sentence
length, structural complexity, lexical choice and type-token ratio.
Occurrences of linguistic mazes and culs-de-sac may also be observed.
Second-stage indirect measures such as body language may also be
useful. It is possible, however, that proficiency may be a mark of
social rather than linguistic status. (CHK)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.






Legislation passed by the Texas State Legislature in 1973 calls
for the testing of children to determine their proficiency in English
prior to their beginning a course of study in the bilingual component
of the elementary grades. This legislation specifies proficiency
measurement at the entry point, usually kindergarten or first grade,
and at any requested exit point up to grade six. Yet the legislature,
in establishing such a requirement, was not aware of the general lack
of adequate measurement techniques, nor did it indicate that
developmental factors were of importance at the age being examined.
In a previous study of language proficiency measures used in 200
large-scale educational experiments, no one measure was used in more
than five of the experiments. Most researchers chose to use techniques
designed for the experiment at hand, thereby indicating their
disagreement with, or their lack of faith in the reliability of,
existing measures. Whether this signals a lack of trust for all such
measures or is symptomatic of uncertainty about the factors which are
to be measured is not clear, but if satisfactory measures are
available, researchers do not concede their existence. Reasons for
such distrust apparently derive from the feeling expressed by Page:
the question is not whether the test measures, "for what it measures
is very well measured, but whether it is measuring the right thing"
(F.B. Page, Iowa Test, Revised ed., Buros 6th Mental Measurement
Handbook, page 51). The result is the development of many instruments
for highly specific purposes having only limited general
applicability. With such a lack of comprehensive tests many
investigators have chosen to use quick or surface measures in their
evaluation of proficiency. Because there are considerations of time or
money, many of these devices have been of the paper-pencil variety
which evaluate language proficiency on the basis of a two or more
choice multiple answer test. These include the True-False question
[1], the multiple answer question [2], fill in the blank [3] from a
preselected list or on the basis of a partial cue, and open ended
questions which may be responses to a reading passage [4], a call for
summary or paraphrase, or an essay type question [5].

These techniques are limited in their heavy emphasis on the written
language and are useful only with older and/or literate individuals
capable of handling a written stimulus. Such techniques are also
limiting for individuals in a culture which stresses oral capability
and underemphasizes the ability to write, current teenage America for
instance. Where the individual is literate but uses a writing system
which is only partially suitable, bridging techniques using a
transcriptional system of some variety may be employed, as, for
instance, in attempting to measure language progress for an American
student in a Japanese language program. Such a technique is
unsatisfactory, for it involves a third writing system and places a
premium on the ability to handle the interchange rapidly. Where
measures concentrate on oral performance, the method of handling a
recording device or one's reaction to being recorded may become
problematic.

Oral and written tests of this variety have several defects. The
most apparent is the arbitrary segmentation of a unified competence into
a series of small, often unrelated components. Such segmentation is
justified to the extent that one can measure only one thing at a time.
Yet the fallacy is immediately apparent: when one pronounces, one must
pronounce something; when one uses the past tense, one must also control
the syntax; when one chooses a specific meaning, one must do so
through all of the artifacts of language simultaneously. By examining
the segments one tends not to see the unity joining the individual items.
The most severe outcome of this is the general success of a student in
his classwork and his subsequent failure to put the pieces together.

Another limitation is that most proficiency measures tend to
reflect academic language usage even when efforts are made to prevent
such a happening. It is the exceptional test which reflects and
investigates current usage. Slang, fad terms, in-group language: all
these are notoriously transient. By the time such material appears in
a test it is generally outdated and has become formalized: where a
test measures lexicon at all it tends to measure mastery of the
obsolete or the unusual.

Most such tests of language proficiency are surface measures. That
is, they rely on language which has been produced in response to a
specific cue. Such material is then examined with a view to its
standing with respect to an earlier performance by the individual or
by some other person or group chosen as a standard. In this sense a
surface test can be said to be norm referenced where the degree of
similarity or lack of similarity defines the level of accomplishment.
Other samples may be compared with some predetermined standard on an
acceptable or unacceptable basis. These are criterion referenced.
Through appropriate techniques, values for criterion referenced tests
can be converted to those for norm referenced tests and vice versa.
Whether norm or criterion referenced, these tests are direct because
material is elicited for examination. Such tests involve individual
awareness of the test in progress, causing, in many cases, a higher or
lower rate of performance than is usual for the individual.
Additionally such tests tend, unconsciously or otherwise, to use an
unspecified criterion as one referent: the standard or scholastic
language of one investigator. Few tests are concerned with, or use,
nonstandard language.

Along with the formal direct measures, a variety of non-formal
tests have been developed. These non-formal tests generally are
non-threatening and may even be regarded as "fun" by the individual
being examined. Non-formal measures are in general as reliable as
formal measures, though they are regarded with less favor, perhaps
because they are amusing and do not conform to subconscious attitudes
about the serious nature of a test.

An individual may be asked to list as many words as possible [8],
either orally or in writing, as an index of lexical availability. He
may also be provided with a formal cue [9], which may be graphic (write
as many words as you can think of which begin with the letter J);
phonetic (say as many words as you can which begin with a given sound)
[10]; or semantically categorized (adjectives or color words, etc.)
[11]. The individual may be asked to rearrange something: how many
words or phrases can you make out of the word PRESTO without repeating
any letter [12] (to date I've found 12 six-letter combinations such as
POSTER, REPOTS, etc.). Such techniques inventory lexical awareness.
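
The PRESTO task can be scored mechanically. What follows is only a sketch under stated assumptions: the function names and the tiny vocabulary are hypothetical and are not part of the paper. A real scoring pass would check responses against a full dictionary.

```python
from collections import Counter


def can_form(candidate: str, source: str) -> bool:
    """Return True if candidate can be spelled from the letters of source,
    using each letter no more often than it occurs in source."""
    needed = Counter(candidate.upper())
    available = Counter(source.upper())
    return all(available[letter] >= count for letter, count in needed.items())


def formable_words(source: str, vocabulary: list[str]) -> list[str]:
    """Return the vocabulary words that can be formed from source, sorted."""
    return sorted(word.upper() for word in vocabulary if can_form(word, source))


# Hypothetical mini vocabulary, for illustration only.
SAMPLE_VOCABULARY = ["poster", "repots", "presto", "rope", "test", "stop", "press"]

if __name__ == "__main__":
    # "test" and "press" are rejected because they repeat a letter
    # that occurs only once in PRESTO.
    print(formable_words("PRESTO", SAMPLE_VOCABULARY))
```

Restricting the result to six-letter entries would give the count of full rearrangements that the text mentions.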

A particular favorite is a Random word list [13], easily available
and with enormous variation. An individual has only to underline all
the words he recognizes in a section to indicate immediately his
recognition percentage. Or he may be asked to define words in another
section for a measure of semantic availability and ability to
verbalize. In a random sample, any percentage of the sample is equal
to that percentage of the total. Despite their ease of use, non-formal
measures suffer from the same drawbacks as do formal tests: individual
awareness of the test and the tensions incidental to such measurement.
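
The sampling argument above amounts to a simple proportion. The sketch below is illustrative only: the function names and the figures in the example (60 words recognized out of a 100-word section, drawn from a list assumed to hold 50,000 words) are assumptions, not data from the paper.

```python
def recognition_rate(recognized: int, sample_size: int) -> float:
    """Fraction of the sampled words that the individual recognized."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= recognized <= sample_size:
        raise ValueError("recognized must lie between 0 and sample_size")
    return recognized / sample_size


def estimated_known_words(recognized: int, sample_size: int, total_words: int) -> int:
    """Project the sample rate onto the whole list, assuming the section
    was drawn at random from that list."""
    return round(recognition_rate(recognized, sample_size) * total_words)


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(recognition_rate(60, 100))
    print(estimated_known_words(60, 100, 50_000))
```

The projection is only as good as the randomness of the section. A section chosen for convenience would not support it.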

Indirect measures may be used to solve these problems. Such
measures are useful when we wish to examine language previously
produced in a non-test situation. This provides a more normal sample
whose characteristics are not contaminated by concerns extraneous to
language.

A number have met with considerable success. Sentence [14] or utterance
length correlates very well with actual ability in both oral and written
language. Length in written language may not indicate a high level of
proficiency in oral language, or the reverse, but length of sentence
wherever found is a significant indicator. Graduate students seem, thus,
to be the most proficient. To guard against the inflationary aspects of
list sentences, it is preferable to use median sentence length rather than
the overall average. Closely connected with sentence length is structural
complexity [16, 17], in which the ratio of complex sentences to simple or
compound sentences is an indication of language mastery. Complex here
means any sentence with subordination, no matter of what variety. A
measure related to this, of recent popular interest, is T(erminal) unit
length, in which the mean length of a main clause plus any associated
clauses becomes significant of proficiency as the number increases. The
ability to handle a sentence which is structurally complex, indicated by
clause length, is an accepted mark of language proficiency in this society.
Precise noncomplex short sentences are also valued. For these sentences,
lexical choice is significant, with use of words expressing tentativeness,
relations between items in a list or a sequence, and conditionality,
along with a varied vocabulary, indicating a greater proficiency. Here
the use of "rare" words is deceptive, for most words are rare. Francis
and Kučera indicate in their study of English that most words have a
frequency of occurrence in running text of only 1 1/2 to 2 per million.
A more satisfactory category is the type-token ratio [18] based on a
sample of 2000-3000 words. In such a passage, the number of different
words, or types, is indicated, then the number of words, or tokens, is
noted. The proportionate ratio established indicates the degree of
lexical and, to a certain extent, syntactic mastery.

Another measure is maze frequency [19]. A maze is a verbal tangle
which the speaker or writer is unable to resolve with ease; he must,
as a consequence, use a large number of items to express his views.
Economy of effort is a factor here. Related to the maze is the
occurrence of a cul-de-sac [20], in which the language user, faced by a
particular maze, abandons the attempt for a start at a different point.
The lower the frequency of mazes and cul-de-sacs, the more proficient
the individual. Mean length of both maze and cul-de-sac also decreases
with increasing proficiency. It should be noted that hesitation pauses
in oral language do not conform to maze or cul-de-sac frequency. Pause
in speech may be (and usually is) an indication of ordering,
rehearsing, or search, reflecting an individual's preparation for
response rather than an inability to respond.






Certain indirect measures, though valuable for other purposes, have
little or no correlation with language proficiency: the most obvious is
the ability to read and/or write. The so-called mechanical skills,
spelling and punctuation [21] in written language, and pronunciation,
have only a tenuous connection with the ability to use the language.
Such areas of proficiency as register or stylistic choice, semantic
association, personal delivery, and others are only casually connected
to the mechanical skills.

There are second stage indirect measures which are also useful in
evaluating proficiency. These measures examine performance in
non-language areas for inferential evaluation of language use. The
most obvious is body language, where movement, attitude, gesture and
other similar activities function as counterpoint to actual language
use. For written language, the quality of the writer's hand or the
choice of perfumed paper are further factors relating to the ability
to communicate. Less obvious though equally important are such items
as sex (women being generally more proficient than men) and age of
associates (through age 30, the older one's associates compared to the
individual's age, the more proficient the individual). Other items
such as socio-economic status, parental occupation, regional
residence, rural/urban distinction, and environmental characteristics
are also significant.

Such second stage indirect measures have been attacked from several
directions because they describe proficiency from a variable,
non-objective point of view which tends to exclude all items not
conforming to expected norms. These criticisms are unnecessarily
misleading, for there exists no objective description of language
proficiency at this time. (For that matter, a definition of language
is still unavailable.) Even should a definition become available,
social factors only slightly understood at present would have to be
considered. In closing I would like to quote Leonard Bloomfield. In
his article "Literate and Illiterate Speech" (American Speech (1927)
2: 432-439), referring to his work among the Menomini, he states: "The
nearest approach to an explanation of 'good' and 'bad' language seems
to be this, then, that by a cumulation of obvious superiorities both
of character and standing, as well as of language, some persons are
felt to be better models of conduct and speech than others. Therefore
even in matters where the preference is not obvious, the forms which
these persons use are felt to have the better flavor. This may be a
generally human state of affairs true in every group and applicable to
all languages." End quote. Proficiency then may be a mark of social
rather than linguistic status.






LASSO HANDOUT
11/2/73

Direct and Indirect Measures

1. The plural of MAN is MEN. True False

2. Henry likes to invent excuses for not studying.

   Make up   Make out   Make over   Make in

3. ____ she own a car? Does, Is, Has

4. According to the paragraph, the proper way to end a question
is with rising, falling, level intonation.

5. Describe the pronunciation of words using the TH digraph.

6. Tabako doo desu ka: I want a cigarette. How about a cigarette?
He smokes cigarettes.

7. Define: 23 skiddoo, A-OK, Colorless green ideas.

8. Write as many words as you can think of in the next three minutes.

9. Write as many words in the next five minutes as you can that begin
with J.

10. Say as many words as you can which begin with 1^1. 

11. Write as many words or phrases which describe sounds adjectivally. 

12. How many different words or phrases can you find in PRESTO without 
repeating any of the letters? 

13. Random Word List from The English Speculum, J.L. Dolby and H.L.
Resnikoff, Lockheed Missiles & Space Co., Sunnyvale, California, 1964.

14. Mean written sentence length, ca. 27 words for college seniors.

15. John and Henry and Mary went to Fort Worth to see SCMLA and LASSO
and then went to Dallas and saw Six Flags and then went home.

16. The man I had seen opening the door was one of Henry's collegiate
friends with whom Jim had been particularly close.

17. I saw the man. He was opening the door. He was Henry's friend.

18. Type/token ratio about 4/10 for average adult.

19. I went to see Henry which he/was had been one of my friends.

20. I went to see Henry which was been rrin^

21. Punctuate: John where Henry had had had had had had

L. Bloomfield, "Literate and Illiterate Speech," American Speech, 1927.



